1. INTRODUCTION

According to statistics, in an average year Vietnam records about 8,000 deaths and 15,000 injuries due to
traffic crashes. The resulting economic losses are estimated at 5 to 12 billion US dollars. Crashes have many
causes, such as using the phone to watch movies, listen to music, text, or call while driving [1], [2]. These
behaviors are highly likely to cause crashes.

Apple CarPlay and Google Android Auto are two intelligent systems, from Apple and Google respectively, that
are installed in automobiles so that the phone's features can be used through the car's screen [3]-[5]. The
ultimate purpose of these technologies is to let users access the necessary phone functions in the most
convenient way, limiting manual operation so that they can focus on the journey. The benefits of Android Auto
and Apple CarPlay are undeniable. However, these two systems are mostly integrated only into modern vehicles.
Previous studies have also addressed safe driving; they focus mainly on smartphone services and on inertial
sensors built into the car or the phone [6], [7]. We aim to address the shortcomings of existing applications;
thus, this paper proposes software that can be installed on any smartphone running the Android operating
system to support automobile drivers. Our contributions can be summarized as follows:
The software automatically recognizes when the user is in a car, turns on Bluetooth to support hands-free
calling, and turns off Bluetooth to save battery when it detects that the user is no longer in the vehicle.
It also adds some essential features, such as automatically reading aloud incoming messages from the short
message service (SMS), Gmail, or Messenger. It detects a crash and automatically sends an emergency message
with the user's location to the user's relatives. It also alerts the driver to erratic driving and to driving
for too long, since driving for a long time can cause fatigue and accidents.

The paper has four sections. After the introduction, Section 2 presents the design model of the software
system, which uses a machine learning algorithm and the accelerometer built into the phone, together with the
algorithm for identifying the driver's status. Section 3 analyzes the results and evaluates the effectiveness
of the proposed system. Finally, Section 4 presents the conclusions.


2. MODEL OF THE SOFTWARE DESIGN SYSTEM 
2.1. System design 

The software system, which identifies the driving state of a car using a machine learning algorithm and the
built-in accelerometer, is designed according to the functional diagram shown in Figure 1. If the "Driving"
status is identified by the machine learning algorithm, the phone automatically activates Bluetooth so that
the driver can hold hands-free conversations. It also automatically receives incoming messages from SMS,
Gmail, or Messenger. If a crash is suspected, the application asks the user for confirmation by vibrating
strongly and playing a voice message that asks whether a crash has occurred, which avoids false alerts. If the
user does not respond within 10 seconds by tapping the confirmation in the dialog box, the software assumes
that a crash has occurred and automatically turns on the global positioning system (GPS) to obtain the
driver's current location. It then automatically sends notification messages with the GPS coordinates and a
map link to the emergency departments so that the driver can receive timely help.
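
The confirmation step can be sketched as follows. This is a minimal Python sketch under stated assumptions,
not the application's actual Android code: the function names, the "no_crash" response token, the queue used
to deliver the user's response, and the Google Maps URL pattern are hypothetical choices made for
illustration. Only the 10-second timeout and the content of the message (GPS coordinates plus a map link)
come from the description above.

```python
import queue

CONFIRM_TIMEOUT_S = 10  # waiting time stated in the text


def build_alert_message(lat: float, lon: float) -> str:
    """Compose the emergency text with coordinates and a map link.

    The URL pattern is an assumption; the paper does not say which map
    service the link points to.
    """
    coords = f"{lat:.5f},{lon:.5f}"
    return (
        f"Possible crash detected at {coords}. "
        f"Map: https://www.google.com/maps/search/?api=1&query={coords}"
    )


def handle_suspected_crash(responses, get_location, send_message,
                           timeout_s: float = CONFIRM_TIMEOUT_S) -> bool:
    """Return True if an alert was sent, False if the user dismissed it.

    `responses` is a queue.Queue that receives the user's dialog answer
    (hypothetical token "no_crash" to dismiss). Silence counts as a crash.
    """
    try:
        answer = responses.get(timeout=timeout_s)
    except queue.Empty:
        answer = None  # no response within the timeout: default to "crash"
    if answer == "no_crash":
        return False  # false alert, nothing is sent
    lat, lon = get_location()  # GPS is queried only at this point
    send_message(build_alert_message(lat, lon))
    return True
```

Passing `get_location` and `send_message` as callables keeps the sketch independent of any particular GPS or
messaging API; in this sketch the location is requested only after the timeout or an explicit confirmation,
which mirrors the idea of keeping GPS off during normal operation.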


[Flowchart: acquire acceleration; detect the status using machine learning algorithm; Bluetooth; confirmation]

Figure 1. Diagram of the operating principle of the software system

The program can detect the "Driving" status using the acceleration data from the built-in accelerometer.
However, we do not feed the raw acceleration data to the machine learning model; instead, we extract a
feature set from the acceleration data. The entire process of determining the status of the driver is shown
in Figure 2.


[Pipeline: acquire acceleration data in X, Y, Z axes; moving window; feature selection; classification;
assign the labels]

Figure 2. The process of classification of driver status


2.2. Collect data from the accelerometer 

The sensor provides 50 samples of acceleration data per second along each of the three axes [6]. After
collecting data from the accelerometer [7], we used a 14-second sliding window to extract the designed
features. The observed data are then labeled and stored in a data set. Table 1 shows an example of
observations. The dataset includes 2560 observations distributed across the states as shown in Table 2.
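
At 50 samples per second, a 14-second window holds 700 samples per axis. The windowing step can be sketched
as below; this is a minimal sketch, and the `step` parameter is an assumption, since the text does not state
how far the window advances between observations.

```python
from typing import Iterator, List, Tuple

SAMPLE_RATE_HZ = 50   # from the text
WINDOW_SECONDS = 14   # from the text
WINDOW_SIZE = SAMPLE_RATE_HZ * WINDOW_SECONDS  # 700 samples per axis

Sample = Tuple[float, float, float]  # (x, y, z) acceleration


def sliding_windows(samples: List[Sample],
                    step: int = WINDOW_SIZE) -> Iterator[List[Sample]]:
    """Yield consecutive windows of WINDOW_SIZE samples.

    `step` is hypothetical: the default produces non-overlapping windows,
    while a smaller value (e.g. WINDOW_SIZE // 2) produces overlapping ones.
    Trailing samples that do not fill a whole window are dropped.
    """
    for start in range(0, len(samples) - WINDOW_SIZE + 1, step):
        yield samples[start:start + WINDOW_SIZE]
```

Each yielded window would then be passed to the feature extraction step and labeled as one observation.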


Table 1. An observation sample along three axes X, Y, and Z


Driving
Acceleration in X (m/s²)  Acceleration in Y (m/s²)  Acceleration in Z (m/s²)
-0.0867                   1.2008                    9.9403
0.1540                    0.7493                    9.9128
0.6307                    0.1791                    9.8924
0.3528                    -0.3694                   10.7692
-0.1083                   -0.2652                   10.6853
0.3768                    0.2378                    9.1378


Table 2. The number of samples observed for each state

Status        Number of samples observed
On Vehicle    581
On Bicycle    308
On Foot       669
Still         899
Tilting       103
Total         2560


2.3. Feature selection 

We perform feature selection because it allows the classifier to provide better classification results [8].
t-distributed stochastic neighbor embedding (t-SNE) is a non-linear technique for mapping multidimensional
data to a lower-dimensional space [9]. This study uses t-SNE to map each data point (X, Y, Z) from
3-dimensional space to 2-dimensional space so that the data can be visualized and inspected easily. Figure 3
shows the distribution of the training data set, without feature selection, in 2-dimensional space using
t-SNE. The data set is not well separated, and the states overlap heavily. Thus, preprocessing and feature
selection clearly influence the model's classification performance [10], [11].

Figure 4 shows a much better result for the 5-state classification: each state is clearly separated, leaving
only a small amount of data mixed into other classes. Thus, feature selection brought outstanding results for
classifying the five states when four features were selected: mean, median, RMS, and range. Table 3
summarizes the chosen features, where $X$ is the set of acceleration samples, $N$ is the number of samples,
and $x_i$ is the value of the $i$-th sample in the set $X$.

Figure 3. t-SNE of the training data before feature selection

Figure 4. t-SNE of the training data using the four chosen features

Table 3. The feature formulas

Features   Formula
Mean       $\mathrm{Mean}(X) = \frac{1}{N} \sum_{i=1}^{N} x_i$
Median     $\mathrm{Median}(X) = \frac{1}{2} \left( x_{N/2} + x_{N/2+1} \right)$, for $X$ sorted in ascending order
RMS        $\mathrm{RMS}(X) = \sqrt{\frac{1}{N} \sum_{i=1}^{N} x_i^2}$
Range      $\mathrm{Range}(X) = \left[ \min\{x_i\}, \max\{x_i\} \right]$
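
The four features of Table 3 can be computed for one axis of one window as in the following minimal sketch.
The function name and the dictionary keys are hypothetical, and the range is returned as its two endpoints
(minimum and maximum), matching the interval form used in the table.

```python
import math
import statistics
from typing import Dict, Sequence


def extract_features(x: Sequence[float]) -> Dict[str, float]:
    """Compute mean, median, RMS, and range endpoints of one axis of a window."""
    n = len(x)
    if n == 0:
        raise ValueError("window must contain at least one sample")
    return {
        "mean": sum(x) / n,
        # statistics.median averages the two middle values when n is even
        "median": statistics.median(x),
        "rms": math.sqrt(sum(v * v for v in x) / n),
        "range_min": min(x),
        "range_max": max(x),
    }
```

Applying the function separately to the X, Y, and Z samples of a window gives one feature vector per
observation.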


2.4. State classification algorithm 

Referring to Google's activity classification service [12], we collect data for five states (on vehicle, on
bicycle, on foot, still, and tilting) and design a machine learning model to accurately classify those five
states from the original dataset. The activities are described in detail in Table 4. Four states (on bicycle,
on foot, still, and tilting) are grouped into a single final state, "No driving." Of the input feature data,
60% is used for training and the remainder for testing. These data sets go through the feature selection step
to find the best features for the data set. In this study, we use the gradient boosting decision tree (GBDT)
method [13]. GBDT is well suited to classification problems because of its strong generalization ability
[14], [15].
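
The grouping of states and the 60/40 split can be sketched as follows. This is a minimal sketch with
hypothetical function names; the text does not say how the split was drawn, so the seeded random shuffle used
here is an assumption, and the GBDT classifier itself is not shown.

```python
import random
from typing import List, Sequence, Tuple

# Mapping from the five collected states to the final status (see Table 4).
FINAL_STATE = {
    "on vehicle": "Driving",
    "on bicycle": "No driving",
    "on foot": "No driving",
    "still": "No driving",
    "tilting": "No driving",
}


def to_final_state(state: str) -> str:
    """Map a collected state name to "Driving" or "No driving"."""
    return FINAL_STATE[state.strip().lower()]  # KeyError for unknown states


def split_60_40(rows: Sequence, seed: int = 0) -> Tuple[List, List]:
    """Shuffle the observations and return (training 60%, testing 40%)."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # assumption: plain random split
    cut = len(shuffled) * 60 // 100
    return shuffled[:cut], shuffled[cut:]
```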


Table 4. States description

Status based on [12]  Descriptions based on [12]                                              Status in our study
On vehicle            The equipment is in the vehicle (car, bus, taxi, truck). The user can   Driving
                      be a driver or a passenger.
On bicycle            The user is on a bicycle.                                               No driving
On foot               The user is lying, running or walking.                                  No driving
Still                 The device remains still (it does not move).                            No driving
Tilting               The angle of the device relative to gravity has changed significantly.  No driving
                      This state usually occurs when the device is picked up (from a table
                      or from the ground) or is in the user's pocket while the user sits up.


3. RESULTS AND DISCUSSION 

To assess the effectiveness of the proposed software in determining the driver's state using data from the
smartphone's built-in accelerometer and the decision tree (DT) machine learning algorithm, we conducted an
evaluation. The overall accuracy of the DT model reaches 95.1% across all classified states. Tables 5 and 6
present the confusion matrix and the performance parameters for each state. In particular, the on vehicle and
still states are classified with high accuracy. Per-state accuracy is high for all classes (greater than
90%). The best precision is achieved for on vehicle, still, and on foot; the remaining states reach an
acceptable level with lower values. Sensitivity (recall) is also high, meaning that few positive cases are
misclassified as negative. A demo clip of our application is available at https://youtu.be/ZbbJ1nmHhvE.
These results suggest that our application can contribute to improving transportation safety [16]-[25].


Table 5. The confusion matrix

Observed      Predicted states                                 Total
states        On vehicle  On bicycle  On foot  Still  Tilting
On vehicle    226         1           0        0      0        227
On bicycle    5           115         36       0      0        156
On foot       0           0           230      2      0        232
Still         0           0           0        356    0        356
Tilting       0           6           0        0      40       46
Total         231         122         266      358    40       1017


Table 6. Algorithm performance evaluation parameters for each state

Status        Algorithm performance evaluation parameters
              Accuracy  Precision  Recall
On vehicle    99.4%     99.5%      97.8%
On bicycle    95.3%     73.7%      94.3%
On foot       96.2%     99.1%      86.5%
Still         99.8%     100%       99.4%
Tilting       99.4%     87%        100%
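
The per-state values can be recomputed from the confusion matrix in Table 5, as in the minimal sketch below
(function and key names are hypothetical). With rows as observed states and columns as predicted states, the
diagonal divided by the row total is the usual definition of recall, and the diagonal divided by the column
total is the usual definition of precision. The values listed under Precision in Table 6 match the row-wise
ratio and those under Recall match the column-wise ratio, so the sketch names the two ratios by how they are
computed rather than assuming which convention the table follows.

```python
STATES = ["On vehicle", "On bicycle", "On foot", "Still", "Tilting"]

# Rows: observed state, columns: predicted state (values from Table 5).
CONFUSION = [
    [226, 1, 0, 0, 0],
    [5, 115, 36, 0, 0],
    [0, 0, 230, 2, 0],
    [0, 0, 0, 356, 0],
    [0, 6, 0, 0, 40],
]


def overall_accuracy(matrix):
    """Fraction of all observations that lie on the diagonal."""
    total = sum(sum(row) for row in matrix)
    return sum(matrix[k][k] for k in range(len(matrix))) / total


def per_state_metrics(matrix, names=STATES):
    """One-vs-rest accuracy and the two diagonal ratios for each state."""
    total = sum(sum(row) for row in matrix)
    metrics = {}
    for k, name in enumerate(names):
        tp = matrix[k][k]
        row_total = sum(matrix[k])                 # observed count of state k
        col_total = sum(row[k] for row in matrix)  # predicted count of state k
        tn = total - row_total - col_total + tp    # neither observed nor predicted k
        metrics[name] = {
            "accuracy": (tp + tn) / total,
            "diag_over_row": tp / row_total,
            "diag_over_col": tp / col_total,
        }
    return metrics
```

For example, `overall_accuracy(CONFUSION)` evaluates 967/1017, about 0.951, in line with the 95.1% reported
above, and `per_state_metrics(CONFUSION)["On bicycle"]["diag_over_row"]` evaluates 115/156, about 0.737.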


3.1. Evaluation of the level of energy consumption

Smartphones from Samsung, Huawei, HTC, LG, or Sony all face battery life problems. Android smartphone
batteries rarely last more than 36 hours of use. Recent Samsung Galaxy advertising campaigns promote a super
energy-saving mode. When this mode is activated, the phone switches the color screen to black and white and
limits the number of apps that can be used. In our study, we worked with the phones in Table 7 to evaluate
energy consumption and propose the following solutions:
- Although location information is obtained from GPS, the software activates GPS automatically only when a
crash occurs; in normal operation, GPS does not need to be turned on.
- The sampling rate of the phone's sensor can be reduced (while avoiding unnecessary, constant state
transitions).
- Users can start and stop the software manually to save energy.


Table 7. Energy consumption of the tested phones when using the software

No.  Model                   Consumption when using the      Consumption when using the
                             software (with GPS), battery    software (not using GPS), battery
                             decrease per day (%)            decrease per day (%)
1    Oppo A37                35.6                            25.0
2    Oppo A71                34.7                            23.1
3    Oppo A83                32.5                            22.4
4    Oppo F5                 30.3                            21.6
5    Samsung J7              35.3                            25.5
6    Samsung J7 Plus         34.5                            23.8
7    Samsung A8              32.6                            22.7
8    Samsung Galaxy A9 Pro   30.8                            21.9
9    Samsung S6              27.3                            20.1
10   Samsung S7              26.2                            20.3
11   Samsung Galaxy S8       25.4                            20.2
12   Xiaomi Note 5A          35.1                            25.4
13   Xiaomi 4X               34.0                            23.6
14   Xiaomi Note 4           32.3                            22.5
15   Xiaomi Mi A1            30.6                            21.7
16   Sony Xperia L2          35.7                            25.6
17   Sony Xperia M5          34.8                            23.8
18   Sony Xperia XA1         32.1                            22.9
19   Sony Xperia XZ1         30.2                            21.7
20   Lenovo K6 Power         32.3                            22.4
21   Lenovo K6 Note          30.5                            21.3
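
To see how much GPS contributes to the daily battery drain, the two columns of Table 7 can be averaged, as in
this minimal sketch. The values are transcribed from the table; the function and key names are hypothetical,
and the averages are a simple summary that the paper itself does not report.

```python
# (with GPS, without GPS) battery decrease per day in %, rows 1-21 of Table 7.
CONSUMPTION = [
    (35.6, 25.0), (34.7, 23.1), (32.5, 22.4), (30.3, 21.6),  # Oppo
    (35.3, 25.5), (34.5, 23.8), (32.6, 22.7), (30.8, 21.9),  # Samsung
    (27.3, 20.1), (26.2, 20.3), (25.4, 20.2),                # Samsung (cont.)
    (35.1, 25.4), (34.0, 23.6), (32.3, 22.5), (30.6, 21.7),  # Xiaomi
    (35.7, 25.6), (34.8, 23.8), (32.1, 22.9), (30.2, 21.7),  # Sony
    (32.3, 22.4), (30.5, 21.3),                              # Lenovo
]


def summarize(rows):
    """Average daily consumption with and without GPS, and their difference."""
    n = len(rows)
    mean_with = sum(w for w, _ in rows) / n
    mean_without = sum(wo for _, wo in rows) / n
    return {
        "mean_with_gps": mean_with,
        "mean_without_gps": mean_without,
        "mean_gps_overhead": mean_with - mean_without,  # percentage points
    }
```

On these values, every phone consumes less without GPS, which supports activating GPS only when a crash is
suspected.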


4. CONCLUSION 

This paper presented an Android application that assists car drivers in driving safely. We focused on
studying specific features derived from the acceleration data. We used the GBDT algorithm to classify the
data into five states: on bicycle, on foot, still, on vehicle, and tilting. The proposed machine learning
model reaches an overall accuracy of 96.5%, which outperforms the overall accuracy of 93% reported in
previous work.